1998-05-26
B) PowerPC Support in C or C++
==============================
In principle, PPC development in C/C++ runs in 5 phases:
1) Rewrite all 68k ASM Stuff in C
2) Adapt Source to ANSI/StormC
3) Adapt to PPC
4) Contextswitch-Optimizing
5) Further Adaptions
Note: If you are using vbcc-WarpOS, and not StormC, then you should
also read vbcc.doc !!! Coding with vbcc differs in some parts from
coding with StormC !!!
Contrary to what you might believe, 3) is only a very small step,
the big step is 2). And yes, you can already do most of this even if
you do not own a PPC. I will now explain the individual steps of
development in more detail.
It is advisable to do steps 1) and 2) already while developing a 68k
version, even if no PPC version is planned at first. It will simplify
PPC development a lot, and in fact it does not need much extra work...
It also has to be noted that things are not that easy when using the PPC
software from Phase 5. That things can be this easy is a special feature
of the WarpOS software.
I won't discuss rewriting 68k ASM to C source here, you should be able
to do this yourself.
2) Adapt Source to ANSI/StormC
------------------------------
Most of the work is not the adaptation to PPC, but the adaptation from SAS/C or
GNU C to StormC. StormC is a strict ANSI compiler, so it only knows the
standard C functions that are part of the ANSI standard.
Some of the unsupported functions can be emulated using the not-yet-released
UnixLib, though.
Note that if your program compiles on SAS/C with the STRICT ANSI mode set,
it should also compile with StormC. You can think of StormC as a compiler
that ALWAYS runs in STRICT ANSI mode.
The following SAS/C functions are not part of ANSI, and thus not
supported by StormC (most of them are quite exotic functions, and you
may not even know many of them, even if you are a proficient
C coder):
astcsma isascii iscsym iscsymf toascii scdir stcpm
stcpma stcsma stccpy stpcpy stcis stcisn stclen
stpbrk stpchr stpchrn strcmpi strnset
strset stcarg stpsym stptok stpblk strbpl strdup
strins strmid stcd_i stcd_l ecvt fcvt gcvt
stch_i stch_l stci_d stci_h stci_o stcl_d stcl_h
stcl_o stco_i stco_l stcu_d stcul_d stpdate
stptime __datecvt __timecvt utpack utunpk cot iabs
max min pow2 __emit getreg putreg geta4
isatty ovlyMgr dqsort fqsort lqsort sqsort strsrt
tqsort drand48 erand48 jrand48 lcong48 lrand48 mrand48
nrand48 seed48 srand48 __autoopenfail chkabort Chk_Abort
_CXBRK __exit onexit _XCEXIT forkl forkv onbreak
wait waitm bldmem rstmem sizmem chkml getmem
getml halloc lsbrk sbrk _MemCleanup rbrk rlsmem
rlsml memccpy movmem repmem setmem swmem except
__matherr poserr datecmp timer __tzset getch fgetchar
fputchar _dread _dwrite read write clrerr close
_dclose fcloseall creat _dcreat _dcreatx fdopen fileno
fmode iomode open _dopen flushall mkstemp mktemp
setnbf _dseek lseek tell access chkufb chmod
fstat getfa getft stat stcgfe stcgfn stcgfp
strmfe strmfn strmfp strsfn unlink argopt chgclk
dos_packet getclk getasn getdfs putenv rawcon stackavail
stacksize stackused chdir closedir dfind dnext findpath
getcd getcwd getfnl getpath mkdir opendir readdir
rmdir seekdir rewinddir telldir readlocale scr_beep scr_bs
scr_cdelete scr_cinsert scr_clear scr_cr scr_curs scr_cursrt scr_cursup
scr_eol scr_home scr_ldelete scr_lf scr_linsert scr_tab _CXFERR
_CXOVF _EPILOG _PROLOG
The most important of the "not allowed" functions are the Level 0 I/O functions
(open,close,read,write). Use fopen,fclose,fread,fwrite instead.
Note: Some of these functions might be supported after all. In the first version
of this text I declared stricmp and strnicmp as not included by mistake (which is
wrong), so there might be more errors in the list :) But probably not many...
probably none...
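As an illustration of the Level 0 replacement described above, here is a minimal sketch that loads a whole file using only ANSI stdio calls. The helper name load_file is hypothetical and not part of StormC or SAS/C, and the sketch assumes the file is small enough to fit into memory.

```c
#include <stdio.h>
#include <stdlib.h>

/* Load a whole file into a malloc()ed buffer using only ANSI stdio calls.
 * load_file is a hypothetical helper name, not part of any library.
 * Returns NULL on failure; on success *size receives the byte count and
 * the caller has to free() the buffer. */
unsigned char *load_file(const char *name, long *size)
{
    FILE *fp = fopen(name, "rb");          /* instead of open() */
    unsigned char *buf = NULL;
    long len;

    if (fp == NULL)
        return NULL;

    if (fseek(fp, 0L, SEEK_END) == 0 && (len = ftell(fp)) >= 0
        && fseek(fp, 0L, SEEK_SET) == 0) {
        /* allocate at least one byte so an empty file is not an error */
        buf = (unsigned char *)malloc(len > 0 ? (size_t)len : 1);
        if (buf != NULL && fread(buf, 1, (size_t)len, fp) != (size_t)len) {
            free(buf);                     /* short read: treat as failure */
            buf = NULL;
        }
        if (buf != NULL)
            *size = len;
    }
    fclose(fp);                            /* instead of close() */
    return buf;
}
```

One fread for the whole file also fits the later advice about avoiding many small I/O calls on the PPC.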
But STRICT ANSI does not only limit the functions. There are also some things
that only cause a warning from SAS/C, but an error from a strict ANSI compiler.
Things like:
char *string=malloc(300);
cause an error from StormC. Correct would be:
char *string=(char *)malloc(300);
(Strictly speaking, ANSI C itself allows the implicit conversion from void *,
and it is C++ that forbids it. The explicit cast is accepted by both.)
StormC wants STRONG TYPING. If you do not own StormC, but want your code to
compile as easily as possible with StormC PPC later, compile with STRICT ANSI.
Problems appear especially with function pointers. If you are not sure how to
cast a thing for STRICT ANSI, maybe you should try void *, it often works for
source that is not strongly typed.
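For the function pointer problems mentioned above, it often helps to give the callback exactly the prototype the caller expects and to do the casts on the data pointers inside the function. The following sketch shows this for qsort(); the names compare_longs and sort_longs are made up for the example.

```c
#include <stdlib.h>

/* A comparison callback with exactly the prototype qsort() expects.
 * The casts are done on the data pointers inside the function, so the
 * function pointer itself never needs a cast. */
static int compare_longs(const void *a, const void *b)
{
    long x = *(const long *)a;
    long y = *(const long *)b;

    return (x > y) - (x < y);   /* -1, 0 or 1 without overflow */
}

/* sort_longs is a hypothetical helper name used only for this example. */
void sort_longs(long *values, size_t count)
{
    qsort(values, count, sizeof(long), compare_longs);
}
```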
You should also replace all K&R syntax (example)
int main(argc,argv)
int argc;
char **argv;
by the normal prototype syntax (example)
int main(int argc,char **argv)
Also code like
int a=5;
int stuff[a];
is not legal in ANSI C. Array dimensions have to be constants.
If you need them variable, use dynamic allocation with malloc.
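A minimal sketch of the malloc replacement for a variable-sized array follows. The helper name make_int_array is hypothetical; the point is only that the size is passed at run time and the cast is written out as StormC expects.

```c
#include <stdlib.h>

/* Allocate "count" ints at run time instead of declaring int stuff[count].
 * make_int_array is a hypothetical helper name. The caller has to free()
 * the result. Returns NULL if count is 0 or no memory is available. */
int *make_int_array(size_t count)
{
    int *stuff;
    size_t i;

    if (count == 0)
        return NULL;
    stuff = (int *)malloc(count * sizeof(int));
    if (stuff == NULL)
        return NULL;
    for (i = 0; i < count; i++)
        stuff[i] = 0;           /* malloc() does not clear the memory */
    return stuff;
}
```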
A good method to convert to "Strict ANSI" is the following:
1. Just compile it, and look at every warning and error.
2. Typecast everything that looks like a pointer (and causes
an error) to void *, and everything else that causes problems
to an int, long or double.
3. If some things still don't work, have a closer look at them now.
Some sources (like the source of Doom) require parts of the
Unix/TCP includes. If you need such things, please contact me,
I have converted the needed things to StormC (for the contact
address see below).
Now we are nearly done with the ANSI/StormC adaptation. At the end
some keywords have to be defined differently:
#define __stdargs
#define __regargs
#define __asm
#define __far FAR
#define __inline inline
#define __volatile volatile
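One way to keep these defines out of the individual source files is a small compatibility header that is included first everywhere. The sketch below is only an assumption about how a project might organise this: the file name stormcompat.h and the switch BUILD_FOR_STORMC are made up and would have to be set by the project itself.

```c
/* stormcompat.h (hypothetical file name)
 * Neutralise the SAS/C keywords when the port is built with StormC.
 * BUILD_FOR_STORMC is an assumed, project-defined switch; it is not
 * something the compiler sets by itself. */
#ifndef STORMCOMPAT_H
#define STORMCOMPAT_H

#ifdef BUILD_FOR_STORMC
#define __stdargs
#define __regargs
#define __asm
#define __far      FAR
#define __inline   inline
#define __volatile volatile
#endif

#endif /* STORMCOMPAT_H */
```

With the switch undefined, the header changes nothing, so the same sources still build with the original compiler.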
__chip, __fast and __interrupt do not exist in StormC, they have to
be replaced by the appropriate OS functions. Some programmers also
use some strange combinations that won't work (static inline is complete
nonsense, get it Unix-coders :) !!! Static OR inline but not both of them !!!)
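And while we are at "bad coding style": be careful with bitfields. ANSI C
only guarantees bitfields of type int, signed int and unsigned int, and
their layout in memory is left to the compiler...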
Ah, and one word to those fclose-always-works-fans. No, fclose does
not work if the file is NOT OPEN !!! You crash your task if you try
to close a file that is not open.
Do
if (file) fclose(file);
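The check above can be wrapped once so that cleanup code cannot forget it. This is a small sketch with a hypothetical helper name, close_file; it also clears the handle so that a second call does no harm.

```c
#include <stdio.h>

/* Close *fp only if it is open, then clear the handle so that a second
 * call is harmless. close_file is a hypothetical helper name.
 * Returns 0 on success or when nothing was open, EOF on a close error. */
int close_file(FILE **fp)
{
    int result = 0;

    if (fp != NULL && *fp != NULL) {
        result = fclose(*fp);
        *fp = NULL;
    }
    return result;
}
```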
A few words about __attribute__ ((packed)). It does not exist in StormC,
and it is a feature that would slow down the PPC *much* if it did exist.
Please do not use __attribute__ ((packed)). The PPC needs a certain
alignment to reach optimal speed.
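If a packed struct was only used to match a file format, one alternative is to keep the in-memory struct naturally aligned and decode the bytes by hand. The sketch below assumes this use case; the record layout (1 byte type, 4 byte big-endian length, 2 byte big-endian flags) is invented for the example.

```c
/* In-memory form: the compiler is free to align the members. */
struct record {
    unsigned char  type;
    unsigned long  length;
    unsigned short flags;
};

#define RECORD_DISK_SIZE 7   /* 1 + 4 + 2 bytes, no padding on disk */

/* Decode one big-endian on-disk record from raw bytes.
 * The layout is an assumption made up for this example. */
void record_unpack(struct record *out, const unsigned char *raw)
{
    out->type   = raw[0];
    out->length = ((unsigned long)raw[1] << 24) | ((unsigned long)raw[2] << 16)
                | ((unsigned long)raw[3] << 8)  |  (unsigned long)raw[4];
    out->flags  = (unsigned short)((raw[5] << 8) | raw[6]);
}
```

Since only single bytes are read from the raw buffer, no misaligned access takes place, whatever the alignment of the buffer is.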
About text constants longer than a line:
It is legal to write:
char *bla="...."\
"...."\
"....";
But the last character before the \ should be a " here.
The notation
char bla[]={"..."\
"..."};
is not accepted by StormC (this is sometimes used in GNU C sources).
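Adjacent string literals are joined by the compiler in ANSI C, so a long text constant can also be written without any backslash at the line ends. A short sketch, with placeholder text:

```c
/* Adjacent string literals are concatenated by the compiler, so every
 * line simply ends with a closing quote. The text is only a placeholder. */
const char *usage_text =
    "Usage: mytool [options] file\n"
    "  -v  verbose output\n"
    "  -h  show this help\n";
```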
If you have done all this, you (should) now have a working StormC 68k source.
Now we come to the PPC stuff. Most of the work is done now. Only small things
remain to do. PPC handling is mostly done internally by the compiler.
3) Adapt to PPC
---------------
First we have to change register parameters:
void test(register __a0 char *mytest);
has to be changed (for example) to
void test(register char *mytest);
The PPC does not know a register a0. But you can still ask the compiler to
use a register by using the keyword "register" without specifying a
register number.
Next we have to make some changes to the OS includes.
Up to now, depending on which compiler you used, you did (example):
#include <clib/exec_protos.h>
#include <pragmas/exec_pragmas.h>
or
#include <clib/exec_protos.h>
#include <pragma/exec_lib.h>
or
#include <clib/exec_protos.h>
#include <inline/exec.h>
or
#include <proto/exec.h>
For StormC PPC you do:
#include <clib/exec_protos.h>
Do not include any pragmas/pragma files, or you will be swamped by error-messages.
Also do not include any proto/ files.
If you want to compile your source for both 68k and PPC (without changing the
source) you do:
#include <clib/exec_protos.h>
#ifndef __PPC__
#include <pragma/exec_lib.h>
#endif
__PPC__ is always set correctly.
Yet another difference between 68k and PPC concerns the usage of subtasks. If you
want to run the subtask as a PPC task (recommended) you have to replace functions like
CreateTask() by CreateTaskPPC() of the powerpc.library. I won't go into detail here,
most of the time the API is absolutely identical to the usual functions, with the
exception of a PPC at the end of the function name. Read the documentation of
WarpOS for more information.
The other method would be to run the subtask as a 68k task and call CreateTask().
To do so you would have to make your program a mixed binary, though, and you also
would not get the full PPC speedup. So usually (unless the subtask does many OS calls)
the CreateTaskPPC() approach is the better method. Also, it is recommended not to
use 68k subtasks in PPC programs, so that your program will get optimal speed
on a 100% PPC Amiga system (which will surely appear some time in the future).
Earlier versions of the compiler had problems with the Tags versions of OS functions.
This has been fixed for quite some time now. I did not notice, because I had not
tested it for quite a while. That is why earlier versions of this document said
that you would have to change such code.
Then we come to the BeginIO function. On the PPC compiler this function only
exists in a form that takes a library base. You can use the following code
(example is for audio.device):
#include <libraries/powerpc.h>
#include <ppcamiga.h>
void BeginIOAudioPPC(struct IORequest *arg1)
{
    extern struct Library *AudioBase;
    ULONG regs[16];                      /* 68k registers, presumably d0-d7 then a0-a7 */
    regs[9] = (ULONG) arg1;              /* slot 9 would be a1: the IORequest */
    __CallLibrary(AudioBase,-30,regs);   /* -30 is the BeginIO vector of a device */
}
An example how this can be used (out of the Sound-Code of ZhaDoom...):
AudioBase = (struct Library *)audio_io->ioa_Request.io_Device;
c = &channel_info[cnum];
c->audio_io->ioa_Request.io_Command = CMD_WRITE;
c->audio_io->ioa_Request.io_Flags = ADIOF_PERVOL;
c->audio_io->ioa_Data = &chip_cache_info[cache_chip_data (id)].chip_data[8];
c->audio_io->ioa_Length = lengths[id] - 8;
c->audio_io->ioa_Period = period_table[pitch];
c->audio_io->ioa_Volume = vol << 2;
c->audio_io->ioa_Cycles = 1;
#ifdef __PPC__
BeginIOAudioPPC((struct IORequest *)c->audio_io);
#else
BeginIO ((struct IORequest *)c->audio_io);
#endif
You see? You always have to read out the library base of a device to do
a BeginIO on PPC...
Some readers are now probably asking themselves what about the famous
"Context-Switch". Well, the truth is, under StormC the compiler
automatically deals with the Contextswitch. You won't have to think
about it... I will say a few words about it anyway:
There are two sorts of Contextswitches:
a) Function-Contextswitches
You have to compile with debugging information the first time you compile
the source. Then the compiler handles the Contextswitches automatically.
Later you can compile without debugging information, if you want.
b) Library-Contextswitches
These need so-called "function-stubs". ppcamiga.lib already contains
the function-stubs for all AmigaOS functions, and for the 68k functions
of rtgmaster (but PPC functions also exist for rtgmaster, and it is
advised to use these). To create a stub for a library that is not yet
supported, you do:
genppcstub mylib_protos.h mylib.fd VERBOSE
You need the proto file and the FD file to create the stub. The stub is a
C source file that you link together with your source. The Contextswitch
itself then works automatically.
4) Contextswitch-Optimizing
---------------------------
With WarpOS a Contextswitch needs about 0.5 milliseconds (with a 200 MHz
PPC 604e board...). You should avoid doing "many Contextswitches
per second" (BTW: the Phase 5 software needs about 1 millisecond for a
Contextswitch).
Examples of things to avoid:
- Loading files on a byte-per-byte basis with fgetc (use fread instead
and load into a Fastram buffer, from which you then get the data on a
byte-per-byte basis)
- WritePixel (work on a Fastram buffer instead)
- OS calls that are called many times per second
Graphics can be handled completely PPC native by using rtgmaster.
rtgmaster is a PPC shared library.
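The first point of the list above can be sketched as a small byte reader that refills a RAM buffer with one fread per block and then hands out the bytes without further library calls. The names struct reader, reader_init and reader_getc, and the block size of 4096 bytes, are assumptions for this example only.

```c
#include <stdio.h>

#define READER_BUFSIZE 4096   /* block size chosen arbitrarily */

/* Byte reader that refills a RAM buffer with one fread() per block.
 * struct reader and the reader_* names are hypothetical. */
struct reader {
    FILE         *fp;
    size_t        pos;    /* next unread byte in buf */
    size_t        fill;   /* number of valid bytes in buf */
    unsigned char buf[READER_BUFSIZE];
};

void reader_init(struct reader *r, FILE *fp)
{
    r->fp = fp;
    r->pos = 0;
    r->fill = 0;
}

/* Returns the next byte (0..255) or EOF, like fgetc(). */
int reader_getc(struct reader *r)
{
    if (r->pos == r->fill) {
        /* buffer used up: fetch the next block with a single call */
        r->fill = fread(r->buf, 1, READER_BUFSIZE, r->fp);
        r->pos = 0;
        if (r->fill == 0)
            return EOF;
    }
    return r->buf[r->pos++];
}
```

Code that used fgetc(fp) in a loop can call reader_getc(&r) instead, so only one library call is made per block.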
Note that some of the standard C functions do Contextswitches.
I think clock() is among them, but I am not sure about it. One way to do
timing that certainly works without Contextswitches is to read the PPC
timer directly with a small PPC ASM routine. First the scale factors:
double tb_scale_lo = ((double)(bus_clock >> 2)) / 35.0;
double tb_scale_hi = (4.294967296E9 / (double)(bus_clock >> 2)) * 35.0;
bus_clock is set to the bus clock in Hz, for example 50000000 for
a 150 MHz board, 66000000 for a 200 MHz board.
Measuring time is then done like this (example of the I_GetTime function
of Doom, which counts in units of 1/35 second):
int I_GetTime (void)
{
    unsigned int clock[2];
    double currtics;
    static double basetics=0.0;
    ppctimer (clock);
    if (basetics == 0.0)
        basetics = ((double) clock[0])*tb_scale_hi + ((double) clock[1])/tb_scale_lo;
    currtics = ((double) clock[0])*tb_scale_hi + ((double) clock[1])/tb_scale_lo;
    return (int) (currtics-basetics);
}
ppctimer looks like (object code for people who do not have StormPowerASM
is contained inside this archive):
vea
XDEF _ppctimer
_ppctimer: mftbu r4
mftbl r5
mftbu r6
cmpw r4,r6
bne _ppctimer
stw r4,0(r3)
stw r5,4(r3)
blr
But well, as I said, I am not sure whether clock() uses Contextswitches or not.
I only had the feeling that ZhaDoom sped up after I replaced the usage of clock()
by the usage of ppctimer().
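The scale factors used by I_GetTime can be checked on any machine with a plain C function that takes the two timer words as arguments. This is only a sketch: the name timebase_to_tics is hypothetical, and it assumes, as the formulas above do, that the timer counts at a quarter of the bus clock and that the result is wanted in units of 1/35 second.

```c
/* Convert a 64-bit timer value (upper and lower 32-bit words) into
 * tics of 1/35 second. Assumes, as in the formulas of the text, that
 * the timer counts at a quarter of the bus clock.
 * timebase_to_tics is a hypothetical helper name. */
double timebase_to_tics(unsigned long upper, unsigned long lower,
                        unsigned long bus_clock)
{
    double tb_hz       = (double)(bus_clock >> 2);        /* timer counts per second */
    double tb_scale_lo = tb_hz / 35.0;                    /* timer counts per tic */
    double tb_scale_hi = (4.294967296E9 / tb_hz) * 35.0;  /* tics per step of the upper word */

    return (double)upper * tb_scale_hi + (double)lower / tb_scale_lo;
}
```

With bus_clock set to 50000000 the timer runs at 12500000 counts per second, so a lower word of 12500000 corresponds to one second, that is about 35 tics.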
5) Further Adaptions
--------------------
Note: The following is fully optional !!! (But it might speed up some things)
It is possible to declare vast memory areas as non-cachable using the BAT registers
of the PPC. For how exactly this is done, read the documentation of WarpOS.
Another optimization would be rewriting parts of the code in PPC Assembler.
As to this, see below.
In some newsgroups it was discussed to run program parts asynchronously on the
68k. Some people even claimed this would only be possible with the Phase 5
software. This is not true. If you want to implement it, you would use the
PPC native message system of WarpOS (keyword "AllocXMsg", refer to the WarpOS
documentation). But I want to point out the disadvantages of this "parallel"
method:
1) On PPC-only machines such code would have serious disadvantages. And such
systems will come...
2) The PowerUP hardware is not good for true multiprocessing. As soon as
your 68k/PPC tasks share memory, you will get serious problems. I won't go
into detail, it was discussed enough in the newsgroup. And it really is not
worth the effort.
I seriously recommend working only "synchronously", running subtasks only on the
same CPU the main program is running on.
Sometimes it is also useful to do a manual Contextswitch to a 68k ASM function,
for example if the ASM function contains tons of OS calls. But if you have such
code, I recommend using a mixed binary anyway. It makes things easier.
PowerPC ASM Optimization
------------------------
At last this one. Again I have to say that it makes no sense to implement
the whole stuff in PPC ASM. You start like this:
1) Implement everything in C
2) Compile it for 68k and use the profiler of StormC (the profiler currently
only exists for 68k, but its data is also useful for PPC)
When you use the profiler you run the program, and it collects statistics about
which functions use how much CPU time. Then you implement the functions that
take the most CPU time in PPC ASM. It is that simple.
You have to keep in mind, though:
- even ASM can't speed up massive numbers of Context-Switches
- ASM also can't speed up the slow GFX bus of the Amiga (even Zorro III is
slow by today's standards...)
Remember always:
Doing a fast implementation in C and then using a profiler to find out which
functions are worth an ASM optimization is much more clever than doing everything
in PPC ASM.
Of course the profiler is only available if you own StormC. SAS/C and GNU C
do not have a profiler.
Now, what do you do if your "original" source is in ASM, not in C? Well,
you insert timing checks and write some timing data to a file ("manual
profiling") at places where you think the most time is wasted. Of course,
real profiling (using StormC) is much easier. Also remember that C
names its functions like:
_Functionname
So if you want to profile ASM stuff you have to add a leading _ to all function
names, and XDEF them all.
Example:
stuff.asm
---------
start:
jsr morestuff
; lots of code
rts
; lots of functions
morestuff:
; lots of code
rts
Would have to be changed to:
startit.c
---------
extern void start(void);
int main(void)
{
    start();
    return 0;
}
stuff.asm
---------
XDEF _start
XDEF _morestuff
;... lots of functions
_start:
jsr _morestuff
;lots of code
rts
_morestuff:
;lots of code
rts
Well, and now you can start profiling... the C part simply starts the ASM
main function...
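For the "manual profiling" mentioned above, a few lines of C are enough to collect timing data and write it to a file, provided the routines to be measured can be called from C as in startit.c. The following is only a sketch: struct prof_slot and the prof_* names are made up, and clock() is used only because it is ANSI C (a finer timer can be swapped in).

```c
#include <stdio.h>
#include <time.h>

/* One timing slot per code section that is to be measured.
 * struct prof_slot and the prof_* names are hypothetical. */
struct prof_slot {
    const char   *name;
    clock_t       started;
    clock_t       total;
    unsigned long calls;
};

void prof_begin(struct prof_slot *s)
{
    s->started = clock();
}

void prof_end(struct prof_slot *s)
{
    s->total += clock() - s->started;
    s->calls++;
}

/* Append one line per slot to a text file. Returns 0 on success. */
int prof_write(const char *filename, const struct prof_slot *slots, int count)
{
    FILE *fp = fopen(filename, "a");
    int i;

    if (fp == NULL)
        return -1;
    for (i = 0; i < count; i++)
        fprintf(fp, "%s: %lu calls, %.3f s\n", slots[i].name, slots[i].calls,
                (double)slots[i].total / CLOCKS_PER_SEC);
    fclose(fp);
    return 0;
}
```

A call to prof_begin before and prof_end after each call of an ASM routine, and one prof_write at program exit, gives a rough picture of where the time goes.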